Abstract

Time of flight cameras may emerge as the 3-D sensor of choice. Today,
time of flight sensors use phase-based sampling, where the phase delay
between emitted and received high-frequency signals encodes distance.
In this paper, we present a new time of flight architecture that relies
only on frequency; we refer to this technique as frequency-domain time
of flight (FD-TOF). Inspired by optical coherence tomography (OCT),
FD-TOF excels when frequency bandwidth is high. With the increasing
frequency of TOF sensors, new challenges to time of flight sensing
continue to emerge. At high frequencies, FD-TOF offers several
potential benefits over phase-based time of flight methods.

1. Introduction 

3D cameras are designed to capture the depth of objects over a spatial
field. The recent emergence of full-framerate 3D cameras such as the
Microsoft Kinect has unlocked a number of applications in computer
vision, computer graphics, and beyond. New applications for 3D cameras
continue to surface, which increases the demand for faster, more
accurate, more flexible 3D systems that can operate in the wild.

Today, there exist a large number of technologies that can acquire 3D
information. Ongoing research efforts in depth sensing tend to exist as
independent cells; e.g., theory from time of flight, which relies on
timing optical echoes, is largely disparate from that of stereo vision,
which uses view-dependent parallax. In this paper we bring together two
depth sensing technologies: time of flight cameras and optical
coherence tomography. The former is embedded in Kinect cameras and the
latter is an interferometric technique used to see microscopic details
in 3-D structure.

In this paper we restrict our scope to phase-based time of flight (TOF)
cameras. In such devices an active light source strobes in a coded
pattern and the optical signal that returns to the focal plane is
characterized by a shift in phase, which corresponds to the depth of an
object. Phase-based TOF forms the basis for several popular cameras in
machine vision, including the Microsoft Kinect, Mesa Swissranger, and
PMD Camboard Nano.

In all current implementations of phase TOF sensors, the phase is
required to measure depth. Therefore, accurate measurement of depth
relies on accurate estimation of phase. This is challenging, for
instance, in environments with multipath interference. An interesting
question is whether it is possible to use the same sensor to obtain
depth without using the phase. We answer this question by generalizing
the specific theory of phase-based time of flight sensors to the
broader domain of optical coherence tomography.

While phase TOF sensors operate on meter-size scenes, 
optical coherence tomography (OCT) works on small 
scenes and can obtain micron resolution of a 3D surface. 
OCT is often used in high-quality imaging of the human 
retina to diagnose glaucoma. There are two types of OCT: 
time-domain (TD-OCT) and frequency domain (FD-OCT). 
It turns out that phase TOF cameras, such as the Kinect, are 
using the same theoretical machinery from TD-OCT. In this 
paper, we are the first to bring the theory of FD-OCT to time 
of flight cameras. 

There are several benefits of our proposed technique of
frequency-domain time of flight (FD-TOF). To obtain high accuracy depth
maps it is necessary to increase the modulation frequency of TOF
cameras. However, increasing the modulation frequency introduces
several new challenges for the current, phase-based TOF architecture,
including phase wrapping and requirements on the phase shift
resolution.

To overcome these challenges we propose a frequency-based architecture
that draws inspiration from the domain of optical coherence tomography.
This technique is notable in that it does not require sampling in phase
and can therefore be implemented on either a TOF sensor or a
conventional CCD/CMOS camera.

2. Related Work 

Time of flight imaging is an emerging field in computational imaging.
This paper strives to be self-contained: Section 3 reviews time of
flight 3-D cameras from first principles. We use the term time of
flight to refer to the time it takes a photon to travel a given
distance through a medium. Work in computer vision and graphics centers
around light transport analysis and algorithms as well as the capture
of time of flight.

Figure 1. We introduce a new architecture for depth sensing:
frequency-domain time of flight (FD-TOF). This concept is analogous to
frequency-domain optical coherence tomography (FD-OCT). In particular,
both techniques encode optical time of flight in the frequency of the
received waveform. For short optical paths (top row), the received
signal in the primal-domain is lower in frequency than that of longer
optical paths (bottom row).

Time of flight analysis represents an increasingly popular area within
the vision and graphics communities. For instance, Raskar and Davis [?]
provided structure to the analysis of transient light transport by
defining the 5-D light transport matrix. For a more complete overview
of temporal light transport the reader is encouraged to review the work
of Kutulakos [33], O'Toole [28], Gupta [?] and Wu [?]. Time of flight
analysis is closely linked to theory from tomography [?], compressive
imaging [?] and coded exposure imaging [?]. Combining transient light
transport with such frameworks has spurred several applications,
including ultra-fast imaging of light in flight [?], scattered light
imaging [36, 13], measurement of BRDFs [?], new models for rendering
[20], and medical imaging [32]. A summary of space-time coding can be
found in a recent report from Mitra et al. [26].

Time of flight capture refers to the design and implementation of
systems that can capture the time that a photon is in flight. Such
devices always include an active light source, and the goal is to
measure the delay between light emission and detection. Because light
travels about one foot in a nanosecond, this is a challenging task.
Fortunately, advances in opto-electronics have enabled wide
accessibility to phase TOF cameras like the Microsoft Kinect. These
cameras operate by strobing a modulated light source with a periodic
pattern, such as a high-frequency square or sine wave. The signal that
returns to the camera exhibits a depth-dependent phase shift, which is
measured using time-correlation. In this paper, we break from the phase
TOF paradigm to obtain depth using frequency alone.

Multifrequency time of flight cameras offer an additional set of
measurements that have been used to improve the operation of phase TOF
cameras. Recent work has used multifrequency measurements to address
global illumination (cf. [25, 29, 6, 12]), perform phase unwrapping, or
demultiplex illumination [?]. Throughout these papers, the same phase
TOF architecture is present, i.e., the received signal is
time-correlated with a reference. In this paper we propose a new
architecture that receives samples only in the frequency domain. This
technique offers a new way to overcome challenges such as global
illumination or phase unwrapping.


Optical coherence tomography is an optical interferometric technique
that can obtain 3-D structures at micron resolution [?]. These devices
are used extensively in biomedical imaging of structures such as the
human retina. There are two main classifications: time domain OCT
(TD-OCT) and frequency domain OCT (FD-OCT). The former is very similar
to phase-based TOF cameras: the received signal is time-correlated with
a reference signal to determine the phase offset. In contrast,
frequency-domain OCT does not time-correlate the received signal.
Instead, the 3-D shape is obtained only by illuminating the sample at
multiple optical frequencies. In this paper, we transpose the ideas
from FD-OCT to the realm of time of flight 3-D cameras.






































3. Preliminaries 


We begin our discussion of time of flight from first principles.
Specifically, we describe the basic principles of phase TOF cameras,
such as the Microsoft Kinect (Section 3.1). Then we provide a condensed
overview of optical coherence tomography in Section 3.2. We omit
details of OCT that are specific to the optics community and thus less
relevant to time of flight imaging. For a much more in-depth overview
of OCT, the reader is encouraged to consult [16] and [9].


Terminology: We use the term primal-domain to refer to the original
frame that a signal is sampled in. We use the term dual-domain to refer
to the frequency domain with respect to the primal.

Note: To simplify exposition, all equations are provided in the context
of a single scene point. The equations are spatially invariant.

3.1. Phase TOF architecture

A phase TOF camera is able to measure the phase delay of optical paths
and obtain depth through the relation

$$d = \frac{z}{2} = \frac{c\,\varphi}{4\pi f_M}, \tag{1}$$

where $z$ is the total path length of a reflection, $d$ is the depth,
$f_M$ is the modulation frequency of the camera and $c$ is the speed of
light. The modulation frequency of the camera is around 100 MHz, which
corresponds to a period of 10 nanoseconds. To estimate $\varphi$ with
high precision, a phase TOF camera contains an active illumination
source that is strobed according to a periodic illumination signal. In
standard implementations (e.g. MS Kinect) the emitted signal takes the
form of a sinusoid

$$g(t) = \cos(f_M t), \tag{2}$$

where, to simplify notation, we assume the emitted signal has unit
amplitude. At the sensor plane, the received optical signal can be
written as

$$s(t) = \alpha \cos(f_M t + \varphi) + \beta, \tag{3}$$

where $\alpha$ is the attenuation in the projected amplitude and
$\beta$ is the intensity of ambient light. It is assumed that the
parameters $\alpha$, $\varphi$, and $\beta$ are time invariant within
the exposure time. To estimate these three parameters we would need to
sample Equation 3 at least three times within the period. Such sampling
in the time-domain is challenging as it requires short, nanosecond
exposures. Instead a TOF camera computes the cross-correlation of the
emitted and received signals:

$$c(\tau) = \frac{\alpha}{2} \cos(f_M \tau + \varphi) + \beta. \tag{4}$$

Now the primal-domain has changed from time to $\tau$. This plays a key
role, as it is physically easier to sample $\tau$ at nanosecond
timescales. Specifically, this is done by introducing a nanosecond
buffer between emitted and reference signals. To recover the phase and
amplitude from the received signal, TOF cameras capture $N$ samples in
the primal-domain and, in software, compute an $N$-point discrete
Fourier transform (DFT). Suppose that four evenly spaced samples are
obtained over the length of a period, for instance,
$\tau = [0, \pi/2, \pi, 3\pi/2]^T$. Then the calculated phase can be
written as

$$\varphi = \arctan\left(\frac{c(\tau_4) - c(\tau_2)}{c(\tau_1) - c(\tau_3)}\right), \tag{5}$$

and the calculated amplitude as

$$\alpha = \sqrt{\left(c(\tau_4) - c(\tau_2)\right)^2 + \left(c(\tau_1) - c(\tau_3)\right)^2}. \tag{6}$$

The two real quantities of amplitude and phase can be compactly
represented as a single complex number using phasor notation:

$$M = \alpha e^{j\varphi}, \tag{7}$$

where $M \in \mathbb{C}$ is the measured phasor, and $j$ is the
imaginary unit. Armed with the phase, a TOF sensor computes depth using
Equation 1 and provides a measure of confidence using the amplitude.
This concludes our overview of the standard phase TOF operation.
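The four-sample recovery described in this section can be illustrated with a minimal Python sketch. It is not the camera's firmware: the function names, the test amplitude and ambient values, and the choice to express the modulation frequency in Hz (so that the phase is $4\pi f_M d / c$) and the reference shifts as phase angles are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def correlation_sample(alpha, phi, beta, shift):
    """Cross-correlation value at a reference shift given as a phase angle (rad)."""
    return 0.5 * alpha * math.cos(shift + phi) + beta


def four_bucket(c1, c2, c3, c4):
    """Recover (phase, amplitude) from samples at shifts 0, pi/2, pi, 3*pi/2."""
    phase = math.atan2(c4 - c2, c1 - c3) % (2 * math.pi)
    amplitude = math.hypot(c4 - c2, c1 - c3)
    return phase, amplitude


def depth_from_phase(phase, f_mod_hz):
    """Depth from phase, with path length z = 2d: d = c * phase / (4 * pi * f)."""
    return C * phase / (4 * math.pi * f_mod_hz)


def simulate_depth(depth_m, f_mod_hz, alpha=0.8, beta=0.3):
    """Simulate one scene point and return (estimated depth, estimated amplitude)."""
    phi = 4 * math.pi * f_mod_hz * depth_m / C
    shifts = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
    samples = [correlation_sample(alpha, phi, beta, s) for s in shifts]
    phase, amplitude = four_bucket(*samples)
    return depth_from_phase(phase, f_mod_hz), amplitude
```

Note that the ambient term cancels in both differences, and that the recovered phase is only known modulo $2\pi$, so a depth beyond the unambiguous range is reported wrapped.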

Multipath interference: We now describe the common multipath
interference (MPI) artifact that affects phase TOF sensors. In such
cases, $K$ reflections return to the imaging sensor and the received
signal can be written as

$$c_{MP}(\tau) = \left(\frac{1}{2}\sum_{i=1}^{K} \alpha_i \cos(f_M \tau + \varphi_i)\right) + \beta, \tag{8}$$

where the subscript MP denotes multipath corrupted measurements. The
received signal is now a summation of sinusoids at the same frequency
but different phases. Obtaining the direct bounce alone, i.e., the
$K = 1$ case, is a very challenging problem. After simplification using
Euler's identity, the measured phase and amplitude in the presence of
interference can be written as

$$\varphi_{MP} = \arctan\left(\frac{\sum_{i=1}^{K} \alpha_i \sin\varphi_i}{\sum_{i=1}^{K} \alpha_i \cos\varphi_i}\right), \tag{9}$$

$$\alpha_{MP} = \sqrt{\left(\sum_{i=1}^{K} \alpha_i \cos\varphi_i\right)^2 + \left(\sum_{i=1}^{K} \alpha_i \sin\varphi_i\right)^2}. \tag{10}$$

Extracting the direct bounce from these corrupted phase and amplitude
measurements turns out to be a challenging, non-linear inverse problem.
The community has no clear solution, as all techniques to date are
ill-conditioned [?]. Later in this paper we return to the multipath
problem and show that a frequency-domain architecture allows optical
paths to be separated.

Increasing the modulation frequency: Increasing the modulation
frequency of TOF sensors allows for greater depth precision [24].
Intuitively, at high frequencies, a small change in the estimated phase
corresponds to a small change in the estimated depth. Specifically, we
can write the range accuracy $\Delta L$ as proportional to

$$\Delta L \propto c / f_M. \tag{11}$$

Over the past few years, time of flight cameras have sported increased
modulation frequencies to boost depth precision. For instance, the new
Kinect tripled the modulation frequency over its competitors,
increasing the modulation frequency to about 100 MHz. However, this
frequency is at the upper limit of what the phase TOF architecture can
handle. In particular, phase wrapping will occur for scene objects at a
depth greater than

$$d_{\text{ambiguity}} = \frac{c}{2 f_M}. \tag{12}$$

Unwrapping, or disambiguation of phase, requires more measurements in
combination with a lookup table and is susceptible to noise [23].
Another challenge that emerges at high frequencies is an increase in
the required sampling rate in the primal-domain.
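As a quick numerical check of the ambiguity relation $d_{\text{ambiguity}} = c / (2 f_M)$, the following Python sketch tabulates the unambiguous depth for a few modulation frequencies. The helper names are hypothetical and the frequencies are chosen only for illustration.

```python
C = 299_792_458.0  # speed of light in m/s


def ambiguity_distance(f_mod_hz):
    """Largest depth that can be measured before the phase wraps."""
    return C / (2.0 * f_mod_hz)


def wrapped_depth(true_depth_m, f_mod_hz):
    """Depth a phase-only measurement would report without unwrapping."""
    return true_depth_m % ambiguity_distance(f_mod_hz)


for f in (30e6, 100e6, 1e9):
    print(f"{f / 1e6:7.0f} MHz -> {ambiguity_distance(f):.3f} m")
```

Under these assumptions the unambiguous range is roughly 1.5 m at 100 MHz and roughly 15 cm at 1 GHz, which is why phase unwrapping becomes unavoidable as the modulation frequency grows.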


Example: Suppose the modulation frequency is 1 GHz, which corresponds
to a period of 1 nanosecond. Two challenges arise. First, from Equation
12, objects further than 15 centimeters will cause phase ambiguities.
Second, since $N$ samples are acquired within the period, $\tau$ has to
be sampled at sub-nanosecond timescales, which is challenging to
implement electronically. While using high modulation frequencies with
phase TOF introduces new challenges, our proposed architecture is
geared toward high modulation frequencies.

3.2. Primer on Optical Coherence Tomography

While phase TOF is a type of electronic interferometry, optical
coherence tomography performs interferometry directly on the optical
signal. Specifically, depth is obtained with respect to a reference
sample by correlating reflections from the reference with the sample.
OCT is divided into two classes, which are characterized by the type of
sampling they use. The older Time-Domain OCT (TD-OCT) technique uses
time-correlations of an optical reference with a sample (to correlate
in time, the reference sample is shifted). This is precisely the
mechanism of phase TOF described in Section 3.1. The second class of
technique is Frequency-Domain OCT (FD-OCT), which performs sampling
only in the frequency domain. In this section, we provide a concise
overview of FD-OCT, describing only the facets that can be applied to
3D cameras.

In Frequency-Domain OCT, depth is obtained by sampling the signal at
different optical wavelengths (i.e. wavelength is the primal-domain).
Figure 1 provides a schematic for the typical FD-OCT system. At a
single wavelength, the detector receives an electric field from the
reference object, which takes the form of

$$R(\lambda) = a(\lambda)\, e^{j\varphi_R(\lambda)}, \tag{13}$$

where $R(\lambda)$ represents the received phasor as a function of
optical wavelength. Similarly, the received electric field from the
sample object is written as

$$S(\lambda) = a(\lambda)\, e^{j\varphi_S(\lambda)}. \tag{14}$$

Note that the amplitudes of the sample and reference are assumed to be
equal, which simplifies our explanation of the concept. A combination
of the two reflections returns to an imaging sensor. The electric field
at the detector is the summation

$$M(\lambda) = \frac{1}{2}\left(R(\lambda) + S(\lambda)\right), \tag{15}$$

where it is assumed that the constituent phasors are halved when they
recombine (for instance, due to a beamsplitter). The current measured
at the detector can be expressed as the real quantity

$$i(\lambda) = \frac{\eta q}{h \nu}\, |M(\lambda)|^2, \tag{16}$$

where $\eta$ is the detector sensitivity, $q$ is the quantum of
electric charge, $h$ is Planck's constant, and $\nu$ is the optical
frequency. By substituting Equation 15 into Equation 16 we obtain
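As a worked sketch of that substitution (our own expansion, not a reproduction of the original typeset expression), assume equal amplitudes $a(\lambda)$ as stated above and write $\varphi_z$ for the phase difference between the reference and sample fields. Constant factors here follow directly from Equations 15 and 16 and may be normalized differently in the expressions that follow.

```latex
\begin{aligned}
i(\lambda)
  &= \frac{\eta q}{h\nu}\,\frac{1}{4}\,
     \bigl(R(\lambda) + S(\lambda)\bigr)\bigl(R(\lambda) + S(\lambda)\bigr)^{*} \\
  &= \frac{\eta q}{h\nu}\,\frac{1}{4}\,
     \Bigl(\underbrace{|R(\lambda)|^{2} + |S(\lambda)|^{2}}_{\text{autocorrelation}}
     + \underbrace{2\,\mathrm{Re}\{R(\lambda)\,S^{*}(\lambda)\}}_{\text{cross-correlation}}\Bigr) \\
  &= \frac{\eta q}{h\nu}\,\frac{a(\lambda)^{2}}{2}\,
     \bigl(1 + \cos\varphi_z\bigr),
  \qquad \varphi_z = \varphi_R(\lambda) - \varphi_S(\lambda).
\end{aligned}
```

The autocorrelation terms carry no dependence on the path difference, while the cross-correlation term oscillates with $\varphi_z$; this is the structure the remainder of the section relies on.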
Here, $\varphi_z$ represents the phase delay due to the difference in
optical path length between the reference and sample reflections.
Similar to the TOF case, phase and z-distance are related:

$$\varphi_z = 2\pi z / \lambda. \tag{18}$$

In Equation 17, note that the autocorrelation terms are DC with respect
to the wavelength. By using this relation along with Equation 18 we can
rewrite Equation 17 as

$$i(\lambda) = \left(a(\lambda)\right)^2 \left(1 + 2\cos\left(\frac{2\pi z}{\lambda}\right)\right). \tag{19}$$








Figure 2. FD-TOF: depth encoded in the frequency of the received
signal. FD-TOF recasts depth estimation as a frequency estimation
problem. The received signal is plotted in the primal-domain for three
different depths. The closest object generates the lowest frequency
signal in the primal-domain, while the furthest object generates the
highest.


Now we introduce an auxiliary variable $k = 2\pi/\lambda$, which is
known as the wavenumber. Equation 19 can be rewritten as

$$i(k) = \left(a(k)\right)^2 \left(1 + 2\cos(kz)\right), \tag{20}$$

where now the primal-domain is the wavenumber ($k$). The dual can be
computed as

$$\mathcal{F}[i(k)](\kappa) \propto \delta(\kappa) + \delta(\kappa \pm z), \tag{21}$$

where $\kappa$ represents the dual domain. To summarize: in
frequency-domain OCT, reflections at multiple wavenumbers are measured
at the detector and depth is encoded in the frequency of the received
signal in the primal-domain.

Note on Depth Resolution: The frequency bandwidth that FD-OCT can
sample is quite large and is related to the difference between the
longest and shortest wavelengths that are used to illuminate the
sample. A large bandwidth allows for very fine axial, or depth,
resolution. Axial resolution is different from depth precision: it
refers to the minimum separation between layers that can be resolved.
For FD-OCT, the bound for axial resolution is complicated and depends
on the optical hardware. However, in Section 4.4 we show that a similar
derivation exists for time of flight cameras and can be expressed in a
simpler form.

4. Frequency Domain Time of Flight 

Inspired by Frequency-Domain OCT, we can reexamine the conventional
operation of TOF cameras. In this section we provide a recipe for depth
estimation by sampling different modulation frequencies at a single
phase step. We refer to this new architecture as FD-TOF, where the
primal-domain is modulation frequency.


4.1. Depth sensing using only modulation frequency

In this section we show how depth can be calculated, not from phase
steps, but from frequency steps. Recall from Section 3.1 that the
received signal takes the form of

$$c(\tau) = \frac{\alpha}{2} \cos(f_M \tau + \varphi) + \beta. \tag{22}$$

In standard phase TOF, $\tau$ represents the primal-domain against
which we would ordinarily compute the $N$-point DFT. Instead, consider
substituting Equation 1 into Equation 22 to obtain

$$c(\tau, f_M) = \frac{\alpha}{2} \cos\left(f_M \tau + \frac{2\pi z}{c} f_M\right) + \beta. \tag{23}$$

Without loss of generality assume that this signal is sampled only at
the zero shift, i.e., $\tau = 0$. Then the received signal at the
sensor takes the form of

$$c(\tau = 0, f_M) = c(f_M) = \frac{\alpha}{2} \cos\left(\frac{2\pi z}{c} f_M\right) + \beta. \tag{24}$$

Now the primal domain is $f_M$ and the associated dual takes the form of

$$\mathcal{F}[c(f_M)](\kappa) \propto \delta(\kappa) + \delta\left(\kappa \pm \frac{2\pi z}{c}\right), \tag{25}$$

where $\kappa$ is the dual-domain, corresponding to the inverse of the
modulation frequency. Analogous to FD-OCT, the depth can be obtained by
finding the location of the support in the dual domain.

4.2. Multipath interference in FD-TOF

An advantage of FD-TOF is that multipath interference is separable in
the dual-domain. Recall that in the multipath problem, $K$ reflections
return to the sensor and the received signal is given by

$$c(f_M) = \left(\frac{1}{2}\sum_{i=1}^{K} \alpha_i \cos\left(\frac{2\pi z_i}{c} f_M\right)\right) + \beta. \tag{26}$$

The associated Fourier transform can now be written as

$$\mathcal{F}[c(f_M)](\kappa) \propto \delta(\kappa) + \sum_{i=1}^{K} \alpha_i\, \delta\left(\kappa \pm \frac{2\pi z_i}{c}\right). \tag{27}$$

Here, the received signal is a sum of sinusoids at the same phase but
at different frequencies. Assume for the moment that the signal is
sampled adequately to avoid aliasing. Then the different depths can be
resolved by analyzing the location of spectral peaks in the Fourier
domain. For comparison, in phase TOF the received signal is a sum of
sinusoids at the same frequency but different phases, and the inverse
problem of recovering the constituent depths is ill-posed.

Figure 3. FD-TOF estimates frequency in the primal-domain of $f_M$,
while phase TOF estimates phase in the primal-domain of $\tau$. At left
(panels "Frequency Domain TOF (20 dB)" and "Phase Domain TOF (20 dB)"),
the primal-domain for FD-TOF and phase TOF at 20 dB SNR. The table at
right lists percent error for different levels of SNR. FD-TOF performs
better for depth recovery when the SNR is low and phase TOF performs
better at high levels of SNR.

    SNR       FD-TOF error   Phase TOF error
    0.1 dB    5.3%           11%
    1.0 dB    1.2%           1.4%
    5.0 dB    0.8%           0.8%
    10 dB     0.7%           0.3%
    20 dB     0.7%           0.05%
    50 dB     0.6%           1e-3%
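The separability argument can be illustrated numerically. The sketch below is a simplified simulation under stated assumptions, not the authors' implementation: modulation frequencies are in Hz, the sweep covers 10 MHz to 1 GHz in 200 steps, two paths of length 2 m and 5 m are mixed, and the dual domain is evaluated directly on a grid of candidate path lengths. The Hann window is our own addition to suppress sidelobes, and all function names are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light, m/s


def fd_tof_samples(path_lengths_m, amplitudes, freqs_hz, beta=0.2):
    """Multipath signal sampled at tau = 0, one value per modulation frequency."""
    return [
        beta + sum(0.5 * a * math.cos(2 * math.pi * f * z / C)
                   for a, z in zip(amplitudes, path_lengths_m))
        for f in freqs_hz
    ]


def dual_domain_magnitude(samples, freqs_hz, z_grid_m):
    """Magnitude of the windowed Fourier sum at each candidate path length."""
    n = len(samples)
    mean = sum(samples) / n  # removes the DC (ambient) term
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    weighted = [(s - mean) * w for s, w in zip(samples, window)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * f * z / C)
                for x, f in zip(weighted, freqs_hz)))
        for z in z_grid_m
    ]


def strongest_peaks(magnitude, z_grid_m, count):
    """Path lengths of the `count` largest local maxima, in ascending order."""
    peaks = [
        (magnitude[i], z_grid_m[i])
        for i in range(1, len(magnitude) - 1)
        if magnitude[i - 1] < magnitude[i] >= magnitude[i + 1]
    ]
    peaks.sort(reverse=True)
    return sorted(z for _, z in peaks[:count])


freqs = [10e6 + i * (990e6 / 199) for i in range(200)]
z_grid = [0.5 + 0.01 * i for i in range(951)]  # candidate paths, 0.5 m to 10 m
samples = fd_tof_samples([2.0, 5.0], [1.0, 0.5], freqs)
recovered = strongest_peaks(dual_domain_magnitude(samples, freqs, z_grid), z_grid, 2)
print(recovered)
```

Each path shows up as its own peak because it contributes a sinusoid of a different frequency along the $f_M$ axis; in the phase TOF setting the same two paths would collapse into a single phasor.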

4.3. Estimating frequency to obtain depth 

FD-TOF recasts depth estimation into the problem of frequency
estimation in the received signal (Equation 27). Frequency estimation
of a discrete waveform is a well-studied problem in the digital signal
processing community. While Fourier analysis is simple and well-known,
there are several more statistically efficient ways to estimate the
frequency. For our results, we use the Quinn and Fernandes technique
[30], which is as efficient as the least squares estimator of
frequency.

There are several other ways to estimate the frequency, including
Pisarenko's method and Multiple Signal Classification (MUSIC). The
reader is encouraged to see Stoica et al. [?] for an excellent overview
of techniques.
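To make the idea of single-tone frequency estimation concrete, the sketch below uses a deliberately simple stand-in: a coarse periodogram search over candidate path lengths followed by parabolic interpolation of the peak. This is not the Quinn and Fernandes estimator, nor Pisarenko's method or MUSIC; the sweep range, noise model, seed, and function names are all assumptions made for illustration.

```python
import cmath
import math
import random

C = 299_792_458.0  # speed of light, m/s


def noisy_fd_tof(z_m, freqs_hz, snr_db, rng, alpha=1.0, beta=0.2):
    """Single-path FD-TOF samples at tau = 0 plus white Gaussian noise."""
    signal_power = (0.5 * alpha) ** 2 / 2.0  # power of the AC tone
    sigma = math.sqrt(signal_power / 10 ** (snr_db / 10.0))
    return [0.5 * alpha * math.cos(2 * math.pi * f * z_m / C) + beta
            + rng.gauss(0.0, sigma) for f in freqs_hz]


def estimate_path_length(samples, freqs_hz, z_min, z_max, step):
    """Coarse periodogram search, then parabolic interpolation of the peak."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    count = int(round((z_max - z_min) / step)) + 1
    grid = [z_min + i * step for i in range(count)]
    power = [abs(sum(x * cmath.exp(-2j * math.pi * f * z / C)
                     for x, f in zip(centered, freqs_hz))) for z in grid]
    k = max(range(1, count - 1), key=power.__getitem__)
    left, mid, right = power[k - 1], power[k], power[k + 1]
    denom = left - 2.0 * mid + right
    offset = 0.0 if denom == 0 else 0.5 * (left - right) / denom
    return grid[k] + offset * step


rng = random.Random(0)  # fixed seed so the run is repeatable
freqs = [10e6 + i * (990e6 / 199) for i in range(200)]
samples = noisy_fd_tof(3.0, freqs, snr_db=20.0, rng=rng)
z_hat = estimate_path_length(samples, freqs, z_min=0.5, z_max=10.0, step=0.05)
print(f"estimated path length: {z_hat:.3f} m")
```

A more statistically efficient estimator would replace the interpolation step; the surrounding structure (sweep the modulation frequency, then locate the tone) stays the same.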

4.4. Frequency bandwidth and resolution 

We now turn to the concept of axial or depth resolution. The concept is
used in OCT to characterize the minimum distance at which two layers
can be separated. We adapt this concept to TOF sensors to characterize
multipath interference. In this context, the modulation frequencies
that are sampled can be described as a boxcar function in the
primal-domain

$$\Pi(f_M) = H(f_M - f_M^{-}) - H(f_M - f_M^{+}), \tag{28}$$

where $f_M^{-}$ and $f_M^{+}$ represent the minimum and maximum
modulation frequencies that are sampled and $H(\cdot)$ refers to the
Heaviside step function. The Fourier transform of $\Pi(f_M)$ takes the
form of a sinc function:

$$\mathcal{F}[\Pi(f_M)](\kappa) \propto \Delta f_M\, \frac{\sin(\Delta f_M \kappa / 2)}{\Delta f_M \kappa / 2}, \tag{29}$$

where $\Delta f_M = f_M^{+} - f_M^{-}$. The full width at half maximum
(FWHM) of this function determines the axial resolution $\Delta z$,
which is approximately

$$\Delta z \approx 1.2\, c / \Delta f_M. \tag{30}$$

In summary, if the frequency bandwidth $\Delta f_M$ is large, it is
possible to disentangle multipath and scattering. If the frequency
bandwidth becomes sufficiently high, FD-TOF may find relevance in
biomedical imaging of small-scale structures.


Example: Current TOF sensors can support a frequency bandwidth of
approximately 100 MHz. Using Equation 30, multipath interference can be
disentangled in the frequency domain if the optical paths differ by
about 3.6 meters. If the frequency bandwidth increases by two orders of
magnitude, then optical paths that differ by a few centimeters can be
separated.
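The numbers in this example follow from the relation $\Delta z \approx 1.2\,c/\Delta f_M$ and can be reproduced with a few lines of Python. The helper names are hypothetical, and treating the FWHM as a hard separability threshold is a simplifying assumption.

```python
C = 299_792_458.0  # speed of light in m/s


def axial_resolution(bandwidth_hz):
    """Approximate minimum separable path difference for a given bandwidth."""
    return 1.2 * C / bandwidth_hz


def resolvable(z1_m, z2_m, bandwidth_hz):
    """True if two optical path lengths differ by at least the axial resolution."""
    return abs(z1_m - z2_m) >= axial_resolution(bandwidth_hz)


for bandwidth in (100e6, 1e9, 10e9):
    print(f"{bandwidth / 1e9:5.1f} GHz -> {axial_resolution(bandwidth):.4f} m")
```

With a 100 MHz bandwidth the threshold is about 3.6 m, and with 10 GHz it is about 3.6 cm, in line with the example above.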

5. Slow-TOF: Can we use a normal camera? 

Sampling the primal-domain in FD-TOF amounts to changing the strobing
frequency of the light source. Since only the light frequency is
changing, a reasonable question is whether it is possible to use a
conventional camera (hereafter “slow camera”). We refer to this
proposed technique as Slow-TOF. Recall that a slow camera integrates
the photon flux over an exposure time. The image formation model in the
primal domain is given by

$$I(f_M) = \int_0^{t_E} \alpha \cos\left(f_M t + \frac{2\pi z}{c} f_M\right) + \beta \; dt, \tag{31}$$

where $t_E$ denotes the exposure time. After expanding the integral,
the measured intensity can be written as a summation of two sinusoids:

$$I(f_M) = \underbrace{\frac{\alpha}{f_M}\left[\sin\left(f_M\left(\frac{2\pi z}{c} + t_E\right)\right) - \sin\left(f_M \frac{2\pi z}{c}\right)\right]}_{\text{AC component}} + \beta\, t_E. \tag{32}$$

As expected, the AC component of Equation 32 evaluates to zero when the
exposure time is zero. However, if the exposure time is greater than
zero, then the measured image intensity is a summation of two
sinusoids. The Fourier transform of Equation 32 with respect to $f_M$
is proportional to

$$\mathcal{F}[I(f_M)](\kappa) \propto \delta(\kappa) + \delta\left(\kappa \pm \left(\frac{2\pi z}{c} + t_E\right)\right) + \delta\left(\kappa \pm \frac{2\pi z}{c}\right). \tag{33}$$

As before, the depth can be calculated by finding the location of
support in the dual domain. In practice, Slow-TOF introduces two
additional challenges:

• The AC component of the received signal decays at 
high modulation frequencies, and 

• The exposure time needs to be consistent during the 
sampling process. 

The first point is clear from the form of Equation 32, where the
amplitude is inversely proportional to the frequency of modulation. One
way to address this challenge is to increase the amplitude of the light
source. For the second point, if the exposure time changes while
sampling in frequency, then the sampling will not be meaningful. For
this reason, implementing Slow-TOF with a consumer-grade DSLR may be
challenging, as the exposure is not consistent. We believe a successful
implementation of Slow-TOF would need to use a machine vision camera.
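The Slow-TOF image formation model can be checked numerically by comparing the integral over the exposure with its closed-form expansion. The sketch below follows the paper's notation literally (the variable `f_m` multiplies both time and the delay $2\pi z/c$) and uses toy values, a 100 ns exposure and `f_m = 1e8`, chosen only so that a pure-Python midpoint rule stays cheap; these values and the function names are assumptions, not a realistic camera configuration.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def slow_tof_closed_form(f_m, z_m, t_exp, alpha=1.0, beta=0.1):
    """Expanded integral: AC term (two sinusoids scaled by 1/f_m) plus DC term."""
    delay = 2 * math.pi * z_m / C
    ac = (alpha / f_m) * (math.sin(f_m * (delay + t_exp)) - math.sin(f_m * delay))
    return ac + beta * t_exp


def slow_tof_numeric(f_m, z_m, t_exp, alpha=1.0, beta=0.1, steps=20_000):
    """Exposure integral evaluated with the midpoint rule."""
    delay = 2 * math.pi * z_m / C
    dt = t_exp / steps
    return math.fsum(
        (alpha * math.cos(f_m * ((i + 0.5) * dt) + f_m * delay) + beta) * dt
        for i in range(steps)
    )


closed = slow_tof_closed_form(1e8, 2.0, 1e-7)
numeric = slow_tof_numeric(1e8, 2.0, 1e-7)
print(closed, numeric)
```

Because the AC term is bounded by $2\alpha/f_M$, its contribution shrinks as the modulation frequency grows, which is the amplitude decay noted in the first bullet above.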

6. Experiments 

To validate that depth can indeed be encoded in frequency, we perform
real and simulated experiments.

6.1. Simulated Experiments

FD-TOF: Noiseless Simulation. To illustrate the basic concept we first
consider a noiseless simulation. Figure 2 plots the received signal in
the primal domain (i.e. modulation frequency) for three different
scenes. The modulation frequency ranges from 10 MHz up to 1 GHz and the
three curves correspond to a single object placed at 1, 2 and 3 meters.
When the object is at the 1 meter position, the received signal is a
low frequency sinusoid (red curve). In contrast, when the object is
placed further away, e.g., at 2 or 3 meters, the received signal
doubles or triples in frequency.


FD-TOF: Performance with Noise. A key benefit of frequency estimation
lies in its robustness to noise. If the received signal consists of a
single tone, recovering the frequency of the sinusoid is a problem with
robust guarantees. We compare our proposed frequency domain TOF
technique with the conventional phase domain TOF technique in the
presence of noise. As illustrated in Figure 3, the goal of FD-TOF is to
measure the frequency of the received, noisy signal (plotted at 20 dB
SNR). For comparison, in phase domain TOF the primal-domain is $\tau$,
the phase shift. The goal of this architecture is to measure the phase
shift in the primal domain between a noisy received signal and a
reference signal. As illustrated in the table of Figure 3, the percent
error of phase and frequency domain TOF is comparable at different
levels of SNR. As a general trend, FD-TOF has more robust performance
at very low levels of SNR, while phase domain TOF has improved
performance at high levels of SNR. To summarize: both techniques are
robust to noise; however, FD-TOF performs better in cases of low SNR.

FD-TOF: Complex object. To analyze the performance of FD-TOF on a
complex scene, we use the Mitsuba software [19] to render a model of a
dragon. This scene consists of a Lambertian dragon with a small amount
of scattering due to interreflections. The ground truth amplitude and
depth images are illustrated in the first row of Figure 4, and
reconstructions at 20 dB and 1 dB SNR occupy the second and third rows
of the figure. 20 dB SNR is a plausible value for a real-world camera
system. At this noise level the dragon is reconstructed accurately.
Even at 1 dB SNR, the depth reconstruction has a PSNR of 7.25 dB.

Figure 4. Recovery of complex scenes (panels: "Ground Truth Scene" and
"FD-TOF Reconstructions"). Here, we render a scene with a dragon
exhibiting scattering and interreflections. The first row shows ground
truth amplitude and depth images (colorbar represents depth in meters).
We are able to reconstruct the dragon at 20 dB and 1 dB SNR.

Figure 5. It is possible to use a conventional camera to recover time
of flight. In FD-TOF the primal-domain is the modulation frequency. By
strobing the illumination at different modulation frequencies, the
received signal at the slow camera exhibits a depth-dependent
frequency. In the plot at right, the furthest object has the highest
frequency in the primal domain, while the closest object exhibits the
lowest frequency. Note the decay in amplitude of the signal as the
modulation frequency increases.

Slow-TOF: Simulations. Recall that in Section 5 we suggested that
FD-TOF can be implemented on a slow camera. As illustrated in Figure 5,
the proposed architecture consists of the same coded strobing that is
used on TOF sensors. However, unlike a TOF sensor, a regular DSLR
integrates the photon flux over an exposure time. This integrated value
is plotted for different strobing frequencies in Figure 5. Observe that
the objects at 1, 2 and 3 meters have distinct curves in the
primal-domain, which can be separated. Note that at higher modulation
frequencies the signal amplitude decays (cf. Equation 32). In order to
glean information from the higher frequency bands it is necessary to
use a light source with a large amplitude.

It is interesting to note that in the simulation it is possible to
distinguish nanosecond delays of light travel (light travels 1 foot in
a nanosecond). This result implies that with a slow camera like a DSLR,
it is theoretically possible to capture nanosecond phenomena.


6.2. Physical Experiments

Implementation: To implement FD-TOF the frequency of the TOF sensor and
light source must be sampled with high granularity. The Mesa Scientific
Lock-in Module (SLIM) is available for purchase from Mesa Imaging in
Zurich. This camera is essentially a bare TOF sensor that accepts a
range of clock signals from 10 MHz to 30 MHz. We add a standard DSLR
lens (Canon EF-S 18-55 mm), a clock generator (Stanford Research DS345)
and a light source bank (twelve 850 nm LEDs). The clock generator
provides the reference code to the camera, which in turn provides the
strobing pattern to the light bank. By placing a photodiode in front of
the strobing light source we are able to verify the coded strobing
pattern on an oscilloscope. The hardware implementation is shown in
Figure 6.

FD-TOF reduced to practice: As illustrated in Figure 6, the scene
consists of a white wall approximately 1 meter away from the camera.
For the foreground object, we place a sheet of paper approximately 80
centimeters away from the camera. In Figure 6 we plot the received
signal in the primal-domain for a foreground and a background pixel.
When plotting each pixel in the primal-domain, we observe that the
pixel that is further away exhibits a higher frequency. Since the
primal-domain ranges from only 10 MHz to 30 MHz, we observe barely one
cycle in the primal-domain, and the change in frequency is harder to
detect. This factor, combined with the low signal from portions of the
paper, affects the quality of the full depth reconstruction, where part
of the paper is not rendered correctly. Nevertheless, the
implementation validates our core idea: that depth can be encoded, not
in phase, but in frequency.

7. Discussion

In summary we present a new architecture for time of flight 3D imaging
by changing the primal-domain. Instead of relying on measuring phase
shifts, we recast the entire problem in the frequency domain. As the
modulation frequency of TOF sensors continues to increase, the benefits
of frequency-domain OCT will start to become accessible to TOF sensing.
As bandwidths move from GHz toward THz and beyond, Equation 30
indicates that the axial resolution shrinks from tens of centimeters to
sub-millimeter scales, approaching the regime of OCT.

7.1. Overview of Benefits

The benefits of FD-TOF include

• Robustness to phase and multipath challenges,

• Compatibility with slow cameras, and

• A framework for flexible interferometry.


Figure 6. Validation of FD-TOF with a real experiment. At left is the hardware prototype: a TOF camera that can be customized to accept
arbitrary clock signals. The oscilloscope shows the strobing pattern of the light source in time. The scene is shown in the center and
consists of a wall at 1 meter and a sheet of paper at 80 cm. The blue point is further away and has a slightly higher frequency than the red
point when plotted in the primal-domain.

Phase and multipath challenges: Today, TOF sensors are at a
critical point where phase wrapping is a real challenge.
Consider the following: a 100 MHz time of flight camera has
an ambiguity distance of only about 1.5 meters (c/2f), which
necessitates phase unwrapping techniques. While there are
benefits to increasing modulation frequency, it is not clear
if phase TOF will be fully compatible (phase unwrapping
algorithms are susceptible to noise). In addition, at high
frequencies, sampling in phase needs to be very precise, down
to picoseconds. Finally, in phase TOF, resolving multipath
interference is a challenging, non-linear inverse problem.
With sufficient frequency bandwidth, the proposed FD-TOF
architecture is robust to such challenges.
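
To make the phase-wrapping argument concrete, the following minimal sketch evaluates the standard unambiguous-range relation d = c / (2f) for a phase-based TOF camera. The function names are hypothetical and chosen only for illustration; the relation assumes a single modulation frequency and a round-trip optical path.

```python
C = 299_792_458.0  # speed of light in m/s


def ambiguity_distance(mod_freq_hz: float) -> float:
    """Largest depth before the phase wraps: the round trip 2*d spans
    exactly one modulation period, so d = c / (2 * f)."""
    return C / (2.0 * mod_freq_hz)


def wrapped_distance(true_distance_m: float, mod_freq_hz: float) -> float:
    """Depth that a single-frequency phase measurement would report,
    since phase is only known modulo 2*pi."""
    return true_distance_m % ambiguity_distance(mod_freq_hz)


# Illustrative frequencies only; higher frequency means shorter range.
for f_mhz in (30, 100, 1000):
    d = ambiguity_distance(f_mhz * 1e6)
    print(f"{f_mhz:>5} MHz -> ambiguity distance {d:.3f} m")
```

Under this relation the ambiguity distance shrinks in inverse proportion to the modulation frequency, which is why phase unwrapping becomes harder to avoid as sensors move to higher frequencies.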

Slow cameras can measure TOF: We have shown that, in theory,
time of flight can now be captured with a conventional, slow
camera. This is enabled through the architecture of FD-TOF,
where sampling in the primal-domain amounts to changing the
frequency of the light source. Our analysis suggests that a
slow camera, equipped with millisecond exposures, has the
potential to capture time of flight to nanosecond precision.
We believe that the implementation of such a device would
lead to interesting follow-up work.
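
The idea that depth appears as a frequency when the modulation frequency is swept can be sketched as a small simulation. This is not the implementation used in the paper. It assumes an idealized, noiseless model in which each long-exposure sample at modulation frequency f is proportional to cos(2*pi*f*tau), with round-trip delay tau = 2d/c. The 10 MHz to 2 GHz sweep is hypothetical (far wider than the 10 MHz to 30 MHz prototype), and the grid-search estimator and all function names are illustrative choices.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def sweep_measurements(depth_m, freqs_hz, amplitude=1.0):
    """Assumed model: one slow sample per modulation frequency f,
    proportional to cos(2*pi*f*tau) with tau = 2*d/c."""
    tau = 2.0 * depth_m / C
    return [amplitude * math.cos(2.0 * math.pi * f * tau) for f in freqs_hz]


def estimate_delay(samples, freqs_hz, tau_grid):
    """Return the candidate delay whose complex exponential correlates
    most strongly with the samples (a periodogram peak search)."""
    def power(tau):
        s = sum(y * cmath.exp(-2j * math.pi * f * tau)
                for y, f in zip(samples, freqs_hz))
        return abs(s)
    return max(tau_grid, key=power)


# Hypothetical sweep: 10 MHz steps from 10 MHz to 2 GHz.
freqs = [10e6 * k for k in range(1, 201)]
# Candidate delays from 0 to 20 ns in 0.02 ns steps.
grid = [k * 2e-11 for k in range(1001)]

for depth in (0.8, 1.0):
    tau_hat = estimate_delay(sweep_measurements(depth, freqs), freqs, grid)
    print(f"true depth {depth:.2f} m -> estimated {tau_hat * C / 2:.3f} m")
```

Each sample here stands for one exposure at a fixed modulation frequency, so the camera itself never needs nanosecond timing; the time resolution comes from the width of the frequency sweep.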

Flexible Interferometry: In this paper we establish a duality
between electronic and optical interferometry. The former
technique performs correlation on the modulated signal, while
the latter correlates the carrier signal. In some cases,
electronic interferometry might be preferable to optical
interferometry. For instance, today, it is challenging to
perform optical interferometry at GHz or THz frequencies due
to well-known optical challenges (see "Terahertz Gap"). The
architecture we have proposed is a step toward increasing the
flexibility of interferometry.

7.2. Limitations 

The proposed technique is not well-suited to available TOF
hardware. First, current TOF sensors are designed to sample
in phase, not in modulation frequency. Second, today's
sensors only have about 100 MHz of frequency bandwidth, which
means that not enough cycles of the sinusoidal signal will be
observed in the primal-domain. In the absence of sufficient
cycles, frequency estimation becomes a more challenging
problem. Finally, we note that accurate tone estimation
requires sampling a number of frequencies. Although in
theory a minimum of 3 frequencies is required, in the
presence of noise more may be required. Follow-up work would
explore tradeoffs in frequency sampling (e.g., how many
frequencies are required, which frequencies are optimal).
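
Both limitations can be illustrated with a short sketch under strong assumptions: a noiseless, zero-offset sinusoid y_k = A cos(2*pi*f_k*tau + phi), sampled at three equally spaced modulation frequencies. For such samples y_0 + y_2 = 2 cos(omega) y_1 with omega = 2*pi*delta_f*tau, so three samples suffice to recover tau. The function names, the 10/20/30 MHz choice, and the 1% perturbation are illustrative assumptions, not values or methods taken from the experiments.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def cycles_in_sweep(bandwidth_hz, depth_m):
    """Number of sinusoid cycles seen across the sweep: bandwidth * tau."""
    return bandwidth_hz * (2.0 * depth_m / C)


def delay_from_three(y0, y1, y2, delta_f_hz):
    """Recover tau from three equispaced samples of a zero-offset sinusoid
    using y0 + y2 = 2*cos(omega)*y1. Assumes y1 != 0 and
    tau < 1 / (2 * delta_f), so that omega lies in (0, pi)."""
    ratio = (y0 + y2) / (2.0 * y1)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding or noise
    return math.acos(ratio) / (2.0 * math.pi * delta_f_hz)


depth = 1.0
tau = 2.0 * depth / C
freqs = (10e6, 20e6, 30e6)  # illustrative, equally spaced by 10 MHz
y = [math.cos(2.0 * math.pi * f * tau) for f in freqs]

clean = delay_from_three(y[0], y[1], y[2], 10e6) * C / 2
noisy = delay_from_three(y[0], y[1], y[2] * 1.01, 10e6) * C / 2
print(f"cycles over 100 MHz at 1 m: {cycles_in_sweep(100e6, depth):.3f}")
print(f"depth from 3 clean samples: {clean:.4f} m")
print(f"depth with 1% error on one sample: {noisy:.4f} m")
```

With less than one cycle observed, a small perturbation of a single sample shifts the estimate noticeably, which is consistent with the remark that more than three frequencies may be needed in the presence of noise.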

8. Conclusion 

We have demonstrated FD-TOF, a new architecture for TOF
sensors. As modulation frequencies continue to increase,
phase TOF cameras face new challenges, such as phase wrapping
and phase stepping. Our proposed system may represent a step
toward high-frequency, electronic interferometry.